Part Context Learning for Visual Tracking
Authors
Abstract
Context information is widely used in computer vision for tracking arbitrary objects [1, 3, 4]. Global context cannot handle object deformation, whereas local part context interactions are relatively stable. When the target appearance changes gradually, two intrinsic properties remain relatively stable in the spatio-temporal 3D space of tracking: the internal interactions among the parts inside the object, and the context interactions between the object and the background. To exploit this structure and these stable relationships in complex environments, we propose a novel part context tracker. The Part Context Tracker (PCT) consists of an appearance model, an internal relation model and a context relation model. The internal relation model formulates the temporal relations of the object itself and of the in-object parts themselves, together with the spatio-temporal relations between the object and the in-object parts. The context relation model constructs the spatio-temporal relations between the in-object parts and the context parts, together with the temporal relations of the context parts themselves. Hence both the physical properties and the appearance information are considered in the optimization process through parts and relations. The contributions are as follows: (1) We first propose a unified context framework which formulates single object tracking as a part context learning problem. (2) The in-object parts and context parts are selected so that we not only pay attention to the appearance of the object, but also focus on the relations among the object, the in-object parts and the context parts. (3) A simple yet robust update strategy using a median filter is utilized, enabling the tracker to deal with appearance change effectively and to alleviate the drift problem. Our framework not only models the object with in-object parts, but also incorporates the interaction between the object and the background through context parts.
The deformable configuration [2, 5] and the temporal structure of these parts are also considered. In Fig. 1, with the object bounding box as the root R, the in-object parts I are defined as the parts selected inside R, each of which covers part of the object appearance. The context parts C are selected from the overlapping area between the object and the background. For a target with K in-object parts and M context parts, the configuration is denoted as $B = (B_0, B_1, \ldots, B_K, B_{K+1}, \ldots, B_{K+M})$, where $B_0$ stands for the target bounding box R, $(B_1, \ldots, B_K) \in I$ are the K in-object part boxes, and $(B_{K+1}, \ldots, B_{K+M}) \in C$ are the M context part boxes. The corresponding features of the root and parts are represented as $X = (x_0, \ldots, x_K, x_{K+1}, \ldots, x_{K+M})$. In short, our framework models the object with three components: an appearance model, an internal relation model and a context relation model.
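The indexing of the configuration B can be made concrete with a small data structure. The following is only a minimal sketch, not the authors' implementation: the class name `PartConfiguration`, the `(x, y, width, height)` box convention, and the example coordinates are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed box convention (not stated in the abstract): (x, y, width, height).
Box = Tuple[float, float, float, float]


@dataclass
class PartConfiguration:
    """Hypothetical container for B = (B_0, B_1..B_K, B_{K+1}..B_{K+M})."""
    root: Box              # B_0, the target bounding box R
    in_object: List[Box]   # B_1..B_K, the in-object parts I
    context: List[Box]     # B_{K+1}..B_{K+M}, the context parts C

    @property
    def K(self) -> int:
        return len(self.in_object)

    @property
    def M(self) -> int:
        return len(self.context)

    def as_tuple(self) -> Tuple[Box, ...]:
        # Flatten into the ordered configuration B of length 1 + K + M.
        return (self.root, *self.in_object, *self.context)

    def role(self, index: int) -> str:
        # Map a position in B back to the kind of box stored there.
        if index == 0:
            return "root"
        if 1 <= index <= self.K:
            return "in-object"
        if self.K < index <= self.K + self.M:
            return "context"
        raise IndexError(f"index {index} is outside 0..{self.K + self.M}")


# Made-up coordinates: K = 2 in-object parts and M = 3 context parts.
config = PartConfiguration(
    root=(50.0, 40.0, 100.0, 80.0),
    in_object=[(55.0, 45.0, 40.0, 30.0), (100.0, 70.0, 40.0, 30.0)],
    context=[(30.0, 40.0, 40.0, 30.0), (130.0, 40.0, 40.0, 30.0),
             (80.0, 100.0, 40.0, 30.0)],
)
B = config.as_tuple()
roles = [config.role(i) for i in range(len(B))]
```

The feature tuple X would follow the same ordering, with one feature vector per entry of `B`; how those features are extracted is not specified in the abstract, so it is left out here.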
Similar resources
Eye-Tracking Method’ Usage for Understanding the Cognitive Processes in Multimedia Learning
Introduction: Designing multimedia learning environments should consist of the evidence-based study and principals about the human learning process. Eye tracking is a way based on the learner processing of learning materials which presented in multimedia learning environments. The aim of the study was to examine the use of the eye-tracking method to investigate the cognitive processes in m...
P58: Visual Working Memory Performance Based on Saccades in Children with and without Specific Learning Disorder: An Eye-Tracking Study
Some of the previous studies show that children with SLD have deficits in visual processing and working memory. Hence, the aim of this research was to investigate problems of visual working memory based on behavioral neuroscience method, using an eye tracker device. The method of present study was ex-post facto study. The participants included couple of twelve children with SLD (mean age=10.92)...
Visual Tracking using Learning Histogram of Oriented Gradients by SVM on Mobile Robot
The intelligence of a mobile robot is highly dependent on its vision. The main objective of an intelligent mobile robot is in its ability to the online image processing, object detection, and especially visual tracking which is a complex task in stochastic environments. Tracking algorithms suffer from sequence challenges such as illumination variation, occlusion, and background clutter, so an a...
Thesis for the degree Doctor of Philosophy
In this thesis we address two related aspects of visual object recognition: the use of motion information, and the use of internal supervision, to help unsupervised learning. These two aspects are inter-related in the current study, since image motion is used for internal supervision, via the detection of spatiotemporal events of active-motion and the use of tracking. Most current work in objec...
Comparing the Impact of Audio-Visual Input Enhancement on Collocation Learning in Traditional and Mobile Learning Contexts
This study investigated the impact of audio-visual input enhancement teaching techniques on improving English as Foreign Language (EFL) learners' collocation learning as well as their accuracy concerning collocation use in narrative writing. In addition, it compared the impact and efficiency of audio-visual input enhancement in two learning contexts, namely traditional and mo...
The Effect of Three Vocabulary Learning Strategies of Word-part, Word-card and Context-clue on Iranian High School Students’ Immediate and Delayed English Vocabulary Learning and Retention
The present study was an attempt to compare the effect of three VLSs, namely word-part strategy, word-card strategy and context-clue strategy on immediate and delayed English vocabulary retention of Iranian third grade high school students. To this end, 90 students, studying at three high schools in Tabriz, in three intact groups, were considered as the participants of the study. In order to en...
Publication date: 2014